Mach1 for Nonuniform Time-scale Modification of Speech: Theory, Technique, and Comparisons

Authors

  • Michele Covell
  • Margaret Withgott
  • Malcolm Slaney

Abstract

We propose a new approach to nonuniform time compression, called Mach1, designed to mimic the natural timing of fast speech. At identical overall compression rates, listener comprehension for Mach1-compressed speech was between 5 and 31 percentage points higher than for linearly compressed speech, and response times dropped by 15%. For rates between 2.5 and 4.2 times real time, there was no significant comprehension loss with increasing Mach1 compression rates. In A–B preference tests, Mach1-compressed speech was chosen 95% of the time. This paper describes the Mach1 technique and our listener-test results. Audio examples can be found at http://www.interval.com/papers/1997-061/.

The research described in this paper is the basis for our submission to the 1998 International Conference on Acoustics, Speech, and Signal Processing (ICASSP). This report gives a longer and more complete description of our approach and results than would fit in the ICASSP paper format. However, since our ICASSP submission is effectively a subset of this description, we have included the IEEE copyright notice below.

Interval Research Corporation Technical Report # 1997-061

Copyright 1998 IEEE. Published in the Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, May 12-15, 1998, Seattle, Washington. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.

Similar resources

MACH1: nonuniform time-scale modification of speech

Time-compression techniques change the playback rate of speech without introducing pitch artifacts. However, when linear-compression techniques are used, human comprehension of time-compressed speech typically degrades at compression rates above two times real time [1]. These degradations are not due to the speech rate per se: Comprehension of linearly compressed speech often breaks down above ...

Free Vibration Analysis of Nonuniform Microbeams Based on Modified Couple Stress Theory: an Analytical Solution

In this study, an analytical solution is presented to calculate the free vibration frequencies of nonuniform microbeams. Scale effects are modelled using modified couple stress theory, and the microbeam is assumed to be thin while Poisson's ratio effects are also taken into account. Nonuniformity is represented by an exponentially varying width along the microbeam while the thickness remains constant. R...

Modification of Audible and Visual Speech

Speech is one of the most common and richest methods that people use to communicate with one another. Our facility with this communication form makes speech a good interface for communicating with or via computers. At the same time, our familiarity with speech makes it difficult to generate synthetic but natural-sounding speech and synthetic but natural-looking lip-synced faces. One way to reduc...

An Overlap-add Technique Based on Waveform Similarity (WSOLA) for High Quality Time-scale Modification of Speech

A concept of waveform similarity is proposed for tackling the problem of time-scale modification of speech, and is worked out in the context of short-time Fourier transform representations. The resulting WSOLA algorithm produces high quality speech output, is algorithmically and computationally efficient and robust, and allows for on-line processing with arbitrary time-scaling factors that may b...

Effects of Pitch Contours Stylization and Time Scale Modification on Natural Speech Synthesis

This paper describes a method for generating intonated speech for natural speech synthesis using a prosody generation model. The effect of pitch modification through pitch contour stylization for parameter extraction, and of time scale modification for its implementation, is discussed. An approach for close-copy syllabic stylization is described. In the latter part, algorithm for impl...

Publication date: 1998